[SYSTEMDS-3930] OOCEvictionManager implementation#2343
[SYSTEMDS-3930] OOCEvictionManager implementation#2343j143 wants to merge 9 commits intoapache:mainfrom
Conversation
|
Hi @mboehm7 - could you please check if this is the implementation that's on your mind? <the tests run fine on local, checking why it's failing on github actions> |
mboehm7
left a comment
There was a problem hiding this comment.
Thanks for the patch @j143. Overall, this eviction manager looks already very good. Once the minor comments are addressed we can already merge it in, and after some preliminary experiments than do the major performance improvements incrementally (e.g., not writing individual blocks but configurable partitions of blocks).
| LocalFileUtils.createLocalFileIfNotExist(_spillDir); | ||
| } | ||
|
|
||
| public static synchronized OOCEvictionManager getInstance() { |
There was a problem hiding this comment.
you can simply make the methods static synchronized, and remove this getInstance method (right now every stream would work with the common object but use non-synchronized methods, and thus, are not thread-safe)
| /** | ||
| * Store a block in the OOC cache (serialize once) | ||
| */ | ||
| public void put(String key, IndexedMatrixValue value) throws IOException { |
There was a problem hiding this comment.
maybe use stream ID instead of the string key
| long size = estimateSerializedSize(mb); | ||
| ByteBuffer bbuff = new ByteBuffer(size); |
There was a problem hiding this comment.
Since you are deadline with many blocks, I would recommend to keep only shallow copies of the individual blocks without any serialization (which might increase the size but reduce the overhead)
|
|
||
| // Spill to disk | ||
| String filename = _spillDir + "/" + key; | ||
| bbuff.evictBuffer(filename); |
There was a problem hiding this comment.
put a TODO in here, that we should improve the evictBuffer to avoid storing a file a second time if it already exists (I would have to double check if we correctly do that in our normal lazywritebuffer).
| return mb.getExactSerializedSize(); | ||
| } | ||
|
|
||
| private MatrixIndexes parseIndexesFromKey(String key) { |
There was a problem hiding this comment.
let's avoid this string concatenation and parsing by simply using the stream ID and blockID as is.
src/main/java/org/apache/sysds/runtime/instructions/ooc/ResettableStream.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/sysds/runtime/instructions/ooc/ResettableStream.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/sysds/runtime/instructions/ooc/ResettableStream.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/sysds/runtime/instructions/ooc/ResettableStream.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/sysds/runtime/instructions/ooc/ResettableStream.java
Outdated
Show resolved
Hide resolved
|
Hi @mboehm7 , I have addressed most of the comments. The tests are failing in GitHub actions - I tried to test with similar memory configuration in my local and couldn't reproduce. |
|
Maybe test your eviction logic with very small buffer pool size (e.g., 4KB) in order to force eviction. The differences in results indicate that as soon as we're evicting we produce incorrect results. |
|
Now, I am able to reproduce this issue in a linux machine - it's retrieving incorrect blocks due to index mismatch. checking the root cause further. _size + requiredSize: 4000009 + 4000009; _limit: 4718 bytes I have decreased the buffer limit still the same. I checked that the eviction logic is triggered. |
|
alright, I'll have a look either today or tomorrow. |
Implemented a base OOC Eviction Manager & used it in ResettableStream instead of _cache.